Bulletin of Electrical Engineering and Informatics 

Vol. 8, No. 2, June 2019, pp. 527-532 
ISSN: 2302-9285, DOI: 10.11591/eei.v8i2.1404 




Analysis of wavelet-based full reference image quality assessment algorithm


Faizah Mokhtar 1 , Ruzelita Ngadiran 2 , Taha Basheer 3 , Amir Nazren Abdul Rahim 4 

1,2,3 School of Computer and Communication Engineering, Universiti Malaysia Perlis, Malaysia 
4 Faculty of Engineering Technology, Universiti Malaysia Perlis, Malaysia 


ABSTRACT

Measurement of image quality plays an important role in numerous image
processing applications such as forensic science, image enhancement and
medical imaging. In recent years, there has been growing interest among
researchers in creating objective image quality assessment (IQA)
algorithms that correlate well with perceived quality. Significant
progress has been made on the full reference (FR) IQA problem in the past
decade. In this paper, we compare five selected FR IQA algorithms on the
TID2008 image dataset. The performance and evaluation results are shown
in graphs and tables. The quantitative assessment shows that the
wavelet-based IQA algorithms outperform the non-wavelet-based IQA
methods, except for the WASH algorithm, whose predictions are better only
for certain distortion types since it takes into account the essential
structural data content of the image.


1. INTRODUCTION

Digital images often pass through several stages such as acquisition, processing, storage and
transmission before they reach the observer [1]. During these stages, including compression, the images are
subjected to different kinds of distortion that may degrade their visual quality. For example, during the
transmission stage, the quality of the received image may decrease because some data are dropped due to the
limited bandwidth of the channel.

Consequently, it is important for image acquisition, management, communication and processing
systems to measure the quality of images at each stage. Hence, image quality assessment (IQA) is essential
for maintaining the quality of the images. In general, measurements of image quality can be classified into
two categories: subjective and objective quality measurements [2].

The human visual system (HVS) is well adapted for this purpose, as the main function of the human eye
is to extract structural information from the viewing field [3]. Therefore, the ideal method of quantifying
image quality is subjective evaluation. In this type of measurement, a number of observers are selected,
tested for their visual capabilities, shown a series of test scenes and asked to score the quality of the
scenes [4]. Nevertheless, subjective evaluations are time-consuming and expensive, which makes them
impractical for real-time applications [5].

To eliminate the need for expensive subjective studies, numerous efforts have been made to develop
objective measurements that correlate with perceived quality. The goal of objective IQA is to design
algorithms that are able to predict the quality of an image automatically and accurately.

1.1. Image quality assessment 

In general, measurement of image quality usually can be classified into two categories which are: 

a. Subjective measurement: A number of observers are selected, tested for their visual capabilities, shown
a series of test scenes and asked to score the quality of the scenes [6]. It is the only “correct” method of
quantifying visual image quality. However, subjective evaluation is usually too inconvenient,
time-consuming and expensive.

b. Objective measurement: These are automatic algorithms for quality assessment that could analyze 
images and report their quality without human involvement. Such methods could eliminate the need for 
expensive subjective studies. 

1.2. Objective image quality assessment 

Objective image quality metrics can be classified according to the availability of an original 
(distortion-free) image, with which the distorted image is to be compared. Most existing approaches are 
known as: 

a. Full-reference (FR): A complete reference image is available. 

b. Reduced-reference (RR): The reference image is only partially available, in the form of a set of 
extracted features as side information. 

c. No-reference (NR): The reference image is not available. 

1.2.1. Full reference method 

The most popular objective quality metrics for image processing applications are Peak Signal-to-Noise
Ratio (PSNR) and Structural Similarity (SSIM) [7]. Both methods have been widely used by other
researchers. Measurement methods usually consider the characteristics of the human visual system (HVS) in
order to correlate with perceptual quality. PSNR is defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} \qquad (1)$$

where MSE is the mean square error and L is the dynamic range of the pixel values. However, this method
does not correlate well with human perception of quality, as it only calculates the pixel difference between
the original and distorted images.
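
As a minimal sketch of Eq. (1), the following Python snippet computes the MSE and PSNR of two small grayscale images stored as nested lists. The pixel values are hypothetical, and the dynamic range L = 255 assumes 8-bit images; neither is taken from the experiments in this paper.

```python
import math


def mse(reference, distorted):
    """Mean square error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        for r, d in zip(ref_row, dist_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def psnr(reference, distorted, dynamic_range=255.0):
    """PSNR in dB as in Eq. (1); infinite when the images are identical."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(dynamic_range ** 2 / error)


# Hypothetical 3x3 image patches (assumed 8-bit, so L = 255).
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
distorted = [[54, 55, 60], [59, 75, 61], [76, 65, 59]]

print(f"MSE  = {mse(reference, distorted):.4f}")
print(f"PSNR = {psnr(reference, distorted):.2f} dB")
```

Because the score depends only on per-pixel differences, two distortions with the same MSE receive the same PSNR regardless of how visible they are.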

SSIM was proposed by Wang et al. [8] to improve on the traditional methods. SSIM measures the
similarity between the original and distorted images. Three kinds of image information are considered in
this method: luminance, contrast and structure. The comparison measures are defined as

$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \qquad (2)$$

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \qquad (3)$$

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \qquad (4)$$

where x and y are two discrete non-negative signals, and $\mu_x$, $\sigma_x^2$ and $\sigma_{xy}$ are the mean
of x, the variance of x and the covariance of x and y, respectively. The resulting SSIM index is given by

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \qquad (5)$$

where $C_1$ and $C_2$ are constants.
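
The relationship between Eqs. (2)-(4) and Eq. (5) can be illustrated with a short Python sketch. It makes several assumptions that are not stated in this paper: the statistics are computed once over the whole signal rather than in local windows, population (1/N) variances are used, and the constants follow the commonly used choice C1 = (0.01 L)^2, C2 = (0.03 L)^2 and C3 = C2 / 2 with L = 255. The pixel values are hypothetical.

```python
import math


def _stats(x, y):
    """Means, population variances and covariance of two equal-length signals."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    return mu_x, mu_y, var_x, var_y, cov_xy


def ssim_constants(dynamic_range=255.0, k1=0.01, k2=0.03):
    """Assumed stabilising constants; C3 = C2 / 2 links Eqs. (2)-(4) to Eq. (5)."""
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    return c1, c2, c2 / 2.0


def ssim_components(x, y, dynamic_range=255.0):
    """Luminance, contrast and structure terms of Eqs. (2)-(4)."""
    mu_x, mu_y, var_x, var_y, cov_xy = _stats(x, y)
    c1, c2, c3 = ssim_constants(dynamic_range)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance, contrast, structure


def ssim_global(x, y, dynamic_range=255.0):
    """SSIM index of Eq. (5), evaluated on whole-signal statistics."""
    mu_x, mu_y, var_x, var_y, cov_xy = _stats(x, y)
    c1, c2, _ = ssim_constants(dynamic_range)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel values of a flattened image patch.
reference = [52, 55, 61, 59, 79, 61, 76, 62, 59]
distorted = [54, 55, 60, 59, 75, 61, 76, 65, 59]

l, c, s = ssim_components(reference, distorted)
print(f"l = {l:.4f}, c = {c:.4f}, s = {s:.4f}")
print(f"l * c * s = {l * c * s:.6f}")
print(f"SSIM      = {ssim_global(reference, distorted):.6f}")
```

Under the assumed C3 = C2 / 2, the product of the three terms reduces algebraically to Eq. (5), which is why only C1 and C2 appear in the final index.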

In recent decades, however, researchers have recognized the advantages of wavelet properties for
image processing, and many image quality assessment algorithms based on the wavelet transform have
been proposed. The wavelet transform matches the multi-channel model of the human visual system
(HVS) [9]. It is able to separate frequency and spatial information and is hence suitable for perceptual
analysis based on the HVS.

In 2013, Reenu et al. proposed WAvelet-based SHarp features (WASH), which takes into account the
sensitivity of human vision to the sharp features of the image, namely sharpness and zero-crossings [10].
The WASH metric is calculated as

$$\mathrm{WASH} = \alpha_{sh}^{\lambda} \, \alpha_{zc}^{1-\lambda} \qquad (6)$$

where $\alpha_{sh}$ is the similarity in sharpness of the reference and distorted images, while $\alpha_{zc}$ is
the final ratio value of the zero-crossings. The value of $\lambda$ is set to 0.8 in order to give the higher
geometric weight to sharpness.
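
Reading Eq. (6) as a weighted geometric mean in which the sharpness term carries the weight 0.8 and the zero-crossing term the remaining 0.2, the pooling step can be sketched as follows. The function name and the two input values are hypothetical; how the sharpness similarity and zero-crossing ratio themselves are computed is not covered here.

```python
def wash_score(sharpness_similarity, zero_crossing_ratio, weight=0.8):
    """Weighted geometric mean of the two WASH terms (interpretation of Eq. (6))."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if sharpness_similarity < 0 or zero_crossing_ratio < 0:
        raise ValueError("both terms must be non-negative")
    return (sharpness_similarity ** weight) * (zero_crossing_ratio ** (1.0 - weight))


# Hypothetical term values: swapping them shows that sharpness dominates the score.
print(wash_score(0.9, 0.5))
print(wash_score(0.5, 0.9))
```

With the weight at 0.8, a drop in sharpness similarity lowers the score far more than an equal drop in the zero-crossing ratio.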

Furthermore, in 2009, Rezazadeh et al. proposed a novel approach for computing and pooling SSIM
in the discrete wavelet domain [11]. Similar to the spatial SSIM metric, the proposed metric is bounded.
To obtain the final similarity score, it computes an edge similarity map and an approximation structural
similarity map. It also introduces a contrast map in the wavelet domain for pooling the structural similarity
maps. The overall quality measure between X and Y is

$$\mathrm{WSSI}(X, Y) = \alpha S_A + (1 - \alpha) S_E \qquad (7)$$

where $S_A$ and $S_E$ are the approximation and edge similarity scores, respectively, and $\alpha$ is a constant.

In addition, a relatively new metric named the Haar wavelet-based Perceptual Similarity Index (HaarPSI)
has been proposed by Reisenhofer et al. [12]. This metric is based on the coefficients of three stages of a
discrete Haar wavelet transform, which are used to measure the local similarities between the reference and
distorted images. It uses six simple 2D Haar wavelet filters to detect vertical and horizontal edges. This
metric can be regarded as a simplified version of the Feature SIMilarity index (FSIM).
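
To illustrate how a Haar wavelet separates an image into an approximation and edge-sensitive detail bands, the sketch below implements a generic one-level orthonormal 2D Haar decomposition in plain Python. It is not the filter bank used by HaarPSI, WSSI or WASH; the function name, the band names and the test image are assumptions made for illustration only.

```python
def haar_level1(image):
    """One-level orthonormal 2D Haar transform of an image with even dimensions.

    Returns four sub-bands, each half the input size in both directions:
    approximation, horizontal-edge detail, vertical-edge detail, diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, horiz, vert, diag = [], [], [], []
    for i in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            a_row.append((a + b + c + d) / 2.0)  # local average (low-pass)
            h_row.append((a + b - c - d) / 2.0)  # top row minus bottom row
            v_row.append((a - b + c - d) / 2.0)  # left column minus right column
            d_row.append((a - b - c + d) / 2.0)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


# Hypothetical 4x4 image containing a single vertical edge.
image = [[0, 1, 1, 1] for _ in range(4)]
approx, horiz, vert, diag = haar_level1(image)
print("approximation:", approx)
print("horizontal-edge detail:", horiz)
print("vertical-edge detail:", vert)
```

Only the vertical-edge band responds to this image, while the approximation band keeps a half-resolution version of the content, which is the kind of frequency and spatial separation the wavelet-based metrics build on.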


2. DATASET AND PERFORMANCE METRICS 

The TID2008 database was used to evaluate the selected IQA metrics. This database contains 1700
test images (25 reference images, 17 types of distortion for each reference image and 4 levels of each
distortion) [13]. The images have been subjected to a wide range of distortions, including various types of
noise, transmission errors, blur, JPEG and JPEG 2000 compression as well as contrast and luminance changes.
TID2008 is intended for the evaluation of FR IQA metrics. It allows estimating how well a metric
corresponds to mean human perception. Three commonly applied performance metrics are used to evaluate
the IQA metrics [14]:

a. Spearman rank-order correlation coefficient (SROCC)

b. Kendall rank-order correlation coefficient (KROCC)

c. Pearson linear correlation coefficient (PLCC)

SROCC and KROCC measure the prediction monotonicity of an IQA metric [15]. They operate only on
the rank of the data points and disregard the relative distance between data points. To compute PLCC,
however, a regression analysis is needed to provide a nonlinear mapping between the objective scores and
the subjective mean opinion scores (MOSs).
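
The difference between the rank-based and linear criteria can be sketched in plain Python as follows. The data are hypothetical (they are not TID2008 scores), KROCC is implemented as Kendall's tau-b as an assumption about the variant used, and the nonlinear regression step that normally precedes PLCC is deliberately omitted so that its purpose becomes visible.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def srocc(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


def krocc(x, y):
    """Kendall tau-b computed from concordant and discordant pairs."""
    n = len(x)
    concordant = discordant = tied_x = tied_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / math.sqrt((pairs - tied_x) * (pairs - tied_y))


# Hypothetical data: the objective score is a monotonic but nonlinear function of MOS.
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [m ** 3 for m in mos]

print(f"SROCC = {srocc(scores, mos):.4f}")
print(f"KROCC = {krocc(scores, mos):.4f}")
print(f"PLCC  = {plcc(scores, mos):.4f}")
```

In this example the rank-based coefficients are perfect because the ordering is preserved, whereas PLCC is penalised by the nonlinearity; fitting a nonlinear mapping from objective scores to MOS before computing PLCC removes that penalty.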


3. RESULTS AND ANALYSIS 

The five IQA metrics were compared by simulating them in MATLAB. The five FR IQA metrics are:

a. Peak Signal to Noise Ratio (PSNR)

b. Structural Similarity Index (SSIM)

c. Wavelet-Based Sharp Features (WASH) 

d. Wavelet Structural Similarity Index (WSSI) 

e. Haar wavelet-based Perceptual Similarity Index (HaarPSI) 

3.1. Discussions 

The scatter plots of MOS versus model predictions are shown in Figure 1, where each point
represents one test image, with its vertical and horizontal axes representing its MOS and the obtained quality
score, respectively. From Tables 1 and 2, it can be concluded that the WASH algorithm has the lowest
prediction performance. This is because the WASH metric provides its best results only for JPEG, JPEG2000
and GBLUR distortions, since it takes into account the essential structural data content of the image.

Table 1. Performance comparison of 5 IQA indices on TID2008 database

         PSNR     SSIM     WASH     WSSI     HaarPSI
SROCC    0.5229   0.6213   0.1413   0.7457   0.9104
KROCC    0.3682   0.4259   0.0992   0.5605   0.7373
PLCC     0.4946   0.6435   0.0723   0.7720   0.9045


Table 2. Overall performance ranking of IQA indices

           SROCC   KROCC   PLCC
PSNR       4       4       4
SSIM       3       3       3
WASH       5       5       5
WSSI       2       2       2
HaarPSI    1       1       1
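
The ranking in Table 2 follows from sorting the values in Table 1 in descending order for each criterion. A small Python sketch of that derivation, using the Table 1 values and hypothetical variable names, is given below.

```python
# Correlation values copied from Table 1.
table1 = {
    "SROCC": {"PSNR": 0.5229, "SSIM": 0.6213, "WASH": 0.1413, "WSSI": 0.7457, "HaarPSI": 0.9104},
    "KROCC": {"PSNR": 0.3682, "SSIM": 0.4259, "WASH": 0.0992, "WSSI": 0.5605, "HaarPSI": 0.7373},
    "PLCC": {"PSNR": 0.4946, "SSIM": 0.6435, "WASH": 0.0723, "WSSI": 0.7720, "HaarPSI": 0.9045},
}


def rank_descending(metric_scores):
    """Map each metric name to its rank, where rank 1 is the highest correlation."""
    ordered = sorted(metric_scores, key=metric_scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


rankings = {criterion: rank_descending(values) for criterion, values in table1.items()}

for criterion, ranking in rankings.items():
    print(criterion, ranking)
```

All three criteria give the same order, so the ranking does not depend on which correlation coefficient is chosen.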



Figure 1. Scatter plots of MOS versus the results obtained for each algorithm: (a) PSNR algorithm, (b) SSIM
algorithm, (c) WASH algorithm, (d) WSSI algorithm, (e) HaarPSI algorithm

Besides that, PSNR has a relatively low prediction value. PSNR is a mathematical formula that
measures image quality based on the pixel difference between the reference and distorted images. Although
this metric is simple and easy to calculate, it ignores the features of human image perception and only
calculates the pixel differences in the image.

SSIM separates the image information into luminance, contrast and structure.
Additionally, this metric is applied locally using a sliding window that moves pixel by pixel over the entire
image. However, it is considered an unstable measure and does not correlate well with subjective
assessment, as it is unable to accurately represent a foreign object in an image if the structural information
is not affected.

WSSI has a relatively high prediction value in this experiment. The reason is that most of the useful
image information is concentrated in the first-level approximation sub-band, which makes the assessment
more accurate and precise. The Haar wavelet is also used to reduce the complexity. However, this metric still
suffers from high complexity, as it is built on top of an existing measure while adding a new contrast map
function, followed by an approximation measurement combined with an edge quality measurement.

HaarPSI achieves the highest prediction value in this experiment. Its performance is relatively low
when tested on image databases restricted to Gaussian blur. This limitation arises because the metric depends
entirely on high-frequency information, which may make it too sensitive to distortions based on low-pass
filtering. Nevertheless, it is computed very efficiently and is significantly faster than the other two metrics.
The Haar wavelet is potentially the simplest and most computationally efficient wavelet. Along with its less
complex computational structure, this suggests that HaarPSI can be applied in real optimization tasks.


4. CONCLUSION 

Image quality assessment plays an important role in various image processing applications.
Numerous efforts have been made by researchers to develop objective IQA metrics. The quantitative
assessment showed that the wavelet-based IQA algorithms outperformed the other two methods. Although
the WASH algorithm has the lowest prediction performance among all the IQA metrics, its prediction value
for JPEG, JPEG2000 and GBLUR distortions surpassed that of the PSNR method. This experiment indicates
that including wavelets in the image quality assessment process can improve the accuracy and reduce the
complexity of an algorithm.